Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add new frequencies command to show term frequencies #48

Merged
merged 1 commit into from
Jun 27, 2024

Conversation

mikix
Copy link
Contributor

@mikix mikix commented Jun 26, 2024

This shows a count of each text/label combo, with a combined total for all annotators. It also flags when the same text has been labelled differently.

Also:

  • Delete some unused code, notably the old unused term_freq code.
  • For the mentions command, don't show duplicate mentions - it's just noise.
  • For the mentions command, sort output by text (rather than using the order that the text appears in the document - which makes less sense when we're removing duplicates)

image

Checklist

  • Consider if documentation (like in docs/) needs to be updated
  • Consider if tests should be added

This shows a count of each text/label combo, with a combined
total for all annotators. It also flags when the same text has
been labelled differently.

Also:
- Delete some unused code, notably the old unused term_freq code.
- For the `mentions` command, don't show duplicate mentions - it's just
  noise.
- For the `mentions` command, sort output by text (rather than using
  the order that the text appears in the document - which makes less
  sense when we're removing duplicates)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just removing some unused code that I noticed.


# Helper method to add all the info for a single annotator to our table
def add_annotator_to_table(name, label_to_text: dict) -> None:
nonlocal has_term_confusion

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first time i've seen this keyword - i get the approach, i think it makes sense in this use case, but it makes me itchy.

maybe i just need time to come around on it, like our friend the walrus.

@mikix mikix merged commit a515d18 into main Jun 27, 2024
2 checks passed
@mikix mikix deleted the mikix/frequency branch June 27, 2024 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants